
Bioinformatics Data Skills

Utah Valley University - BIOL490R (Special Topics)

Course Syllabus

Course file repository

Shared Course Notes

(Anyone with link can edit)


Table of Contents

Week 1 | Week 5 | Week 9 | Week 13

Week 2 | Week 6 | Week 10 | Week 14

Week 3 | Week 7 | Week 11 | Week 15

Week 4 | Week 8 | Week 12 | Week 16


Command Line Projects and the Unix Philosophy

Week 1

Ideology of ‘Robust and Reproducible’ Bioinformatics

Topics:

Assignments:

  • Work through BDS Chapter 1
  • Assignment 1 - Reflection piece on why you want to learn command line skills and best practices
  • Set up your computer environment

Resources

Practice



Week 2

Proper Project Organization

Topics:

Assignments:

  • Work through BDS Chapter 2
  • Assignment 2 - Create an organized project template using code
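
One possible way to script such a template, as a minimal sketch. The project name `my-project` and the directory names are illustrative assumptions, not a required layout.

```shell
#!/bin/sh
set -eu

# Hypothetical project name; pass a different one as the first argument.
project="${1:-my-project}"

# Create each subdirectory; -p also creates missing parent directories
# and does not complain if a directory already exists.
for dir in data/raw data/processed scripts results docs; do
    mkdir -p "$project/$dir"
done

# Start a README so the layout is documented from day one.
printf '# %s\n\nCreated on %s\n' "$project" "$(date +%F)" > "$project/README.md"

# Show what was created.
find "$project" | sort
```

Because the layout lives in a script, the same template can be rebuilt for every new project and kept under version control.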

Resources

Practice


Unix refresher and sequence data types

Week 3

Remedial Unix Shell

Topics:

Assignments:

  • Work through BDS Chapter 3
  • Assignment 3 - Use pipes and redirects

Resources

Practice



Week 4

Working with Sequence Data

Topics:

Assignments:

  • Work through BDS Chapter 10
  • Assignment 4 - Trim reads, count nucleotides, and convert from FASTQ to FASTA
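
Two of these tasks can be sketched with standard Unix tools. This is a minimal sketch, not the assignment solution: the file names are made up, and it assumes every FASTQ record is exactly four lines (no wrapped sequences). Trimming is left out because it depends on the tool and quality thresholds chosen.

```shell
# Build a tiny two-record FASTQ file so the example is self-contained.
printf '@read1\nACGTAC\n+\nIIIIII\n@read2\nGGCATT\n+\nIIIIII\n' > example.fastq

# FASTQ -> FASTA: line 1 of each 4-line record is the header (swap @ for >),
# line 2 is the sequence; the "+" line and the quality line are dropped.
awk 'NR % 4 == 1 { print ">" substr($0, 2) }
     NR % 4 == 2 { print }' example.fastq > example.fasta

# Count nucleotides: keep sequence lines only, put one base per line, tally.
grep -v '^>' example.fasta | fold -w 1 | sort | uniq -c
```

For this toy input, each of A, C, G, and T is counted 3 times.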

Resources

Practice


Using Existing Tools in the Command Line

Week 5

Combining Unix Skills and Command-Line Software

Topics:

Assignments:

  • Case Study 1 - Run ITSxpress on fungal data
    • keep complex results
    • customize output for question at hand
    • make it reproducible
    • do it on a hundred files
    • store all log data in one file, then answer: for how many sequences in total were no ITS start or stop sites identified?
    • push “workflow.txt” (not data)
    • uses chapters: 2, 3, 10 (for loop, grep, redirect 2>, flags)
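
The loop, redirect, and grep pattern named above can be sketched as follows. `run_tool` is a hypothetical stand-in function, not ITSxpress, and its log message is invented for the demo. Swap in the real command and search for whatever message the real log contains.

```shell
# Hypothetical stand-in for a command-line program: results go to stdout,
# log messages go to stderr, as many tools do.
run_tool() {
    echo "result for $1"
    echo "LOG: finished $1" >&2    # invented log line, for the demo only
}

# Make three empty demo input files.
mkdir -p demo_data demo_results
for i in 1 2 3; do
    : > "demo_data/sample_$i.fastq"
done

# Start with an empty log, then process every file.
# ">" saves each result separately; "2>>" appends all stderr to one log file.
: > all_logs.txt
for f in demo_data/*.fastq; do
    name=$(basename "$f" .fastq)
    run_tool "$f" > "demo_results/$name.out" 2>> all_logs.txt
done

# Count the matching log lines across all samples.
grep -c 'LOG: finished' all_logs.txt    # prints 3
```

The same loop handles three files or a hundred without changes, which is the point of scripting it.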

Resources

Practice


More Powerful Unix Tools

Week 6

Unix Data Tools

Topics:

Assignments:

  • Work through BDS Chapter 7

Resources

Practice



Week 7

Unix Data Tools, Continued

Topics:

Assignments:

  • Continue working through BDS Chapter 7
  • Assignment 5 - Build a tabular file from a FASTA database
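
One way to approach the FASTA-to-table conversion is with awk. This is a sketch under assumptions: the file names and the three-column layout (ID, description, sequence) are illustrative, and the header is split at its first space.

```shell
# Tiny demo database; the first sequence is wrapped over two lines.
printf '>seq1 Fungi\nACGT\nACGT\n>seq2 Bacteria\nGGCC\n' > example_db.fasta

# Print one tab-separated row per record: ID, description, joined sequence.
awk '
    /^>/ {
        # A new header means the previous record is complete, so print it.
        if (id != "") print id "\t" desc "\t" seq
        header = substr($0, 2)
        id = header
        desc = ""
        split_at = index(header, " ")
        if (split_at > 0) {
            id = substr(header, 1, split_at - 1)
            desc = substr(header, split_at + 1)
        }
        seq = ""
        next
    }
    { seq = seq $0 }    # join wrapped sequence lines
    END { if (id != "") print id "\t" desc "\t" seq }
' example_db.fasta > example_db.tsv

# Columns can now be handled with cut, sort, and the other Chapter 7 tools.
cut -f 1,3 example_db.tsv
```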

Resources

Practice


Finding and Retrieving Data

Week 8

Online Repositories and Approaches to Downloading

Topics:

Assignments:

  • Work through BDS Chapter 6
  • Assignment 6 - Download data with ftp, curl, EDirect, and sra-toolkit
  • Case Study 2 - Reproducibly downloading data (BDS p. 120)

    • Full documentation
    • Checksums
    • Markdown README
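
The checksum step can be sketched like this. It assumes GNU coreutils `sha256sum` is available (on macOS, `shasum -a 256` plays the same role), and the file names are placeholders for a real download.

```shell
# Stand-in for a downloaded file; in practice this would come from curl or ftp.
printf '>seq1\nACGT\n' > downloaded_example.fasta

# Record the checksum next to the data so it can be kept with the README.
sha256sum downloaded_example.fasta > checksums.sha256

# Later, or on another machine: confirm the file has not changed.
# The command exits non-zero if any file fails the check.
sha256sum -c checksums.sha256
```

Recording the download date and source URL in the README alongside the checksum lets someone else confirm they retrieved the same file.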

Resources

Practice


Working with Supercomputers

Week 9

Interfacing with Remote Machines

Topics:

Assignments:

  • Work through BDS Chapter 4
  • Assignment 7 - Build three separate SLURM scripts to run FASTA analyses

Resources

Practice



Week 10

Interfacing with Remote Machines, Continued

Topics:

Assignments:

  • Case Study 3 - Push a large compute job to a remote machine and retrieve the results

    • Main script and data will be provided
    • We need onboarding training for U of U cluster
    • Each student needs their own account
    • uses chapters: 1, 2, 3, 4

Resources

Practice


Version Control and Collaborations

Week 11

Git for Scientists

Topics:

Assignments:

  • Work through BDS Chapter 5
  • Assignment 8 - Git collaboration and merge

Resources

Practice



Week 12

Bioinformatics Shell Scripting

Topics:

Assignments:

  • Work through BDS Chapter 12
  • Assignment 9 - Git collaboration and merge

Resources

Practice


Putting it all together

Week 13

Composing Full Pipelines

Topics:

Assignments:

  • Continue working through BDS Chapter 12

Resources

Practice



Week 14

Running a Pipeline on a Remote Machine

Topics:

Assignments:

  • Case Study 3 - Assemble a metagenome on the remote cluster

    • metaSPAdes
    • classify reads with DIAMOND?

Resources

Practice



Week 15

Creating a Custom Bioinformatics Tool

Topics:

Assignments:

  • Case Study 4 - Download NCBI marker genes and use Unix tools to build a custom QIIME-compatible reference database

    • Re-engineer https://github.com/gzahn/Format_NCBI_QIIME
    • EDirect (command-line version of the NCBI search tool)
    • ftp, BLAST, NCBI, data cleaning and reformatting
    • Turn into a completely reproducible and portable script
    • requires entrez_qiime.py installation and use
    • push tool to GitHub
    • uses chapters: 2, 3, 5, 6, 7, 10, 12

Topics:

  • Intro to genetic data in R

Assignments:

  • Work on Final Project
  • Assignment 10 (working with DNA data in R)



Week 16

Where to go from here?

Topics:

Assignments:

  • Assignment 10 - Reflection piece on what you’ve learned and what next steps you’ll take
